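Loading in our Libraries
library(tidyverse) # assumed: provides read_csv(), %>%, filter(), select(), ggplot()
library(here)      # here()
library(janitor)   # clean_names()
library(plotly)    # ggplotly()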
Read in the 2023 Sustainable Development Report (SDR) data with read_csv() and here()
sdr_data <- read_csv(here("data/SDR-2023-Data.csv"))
Clean column names with clean_names() from the janitor package
sdr_data <- sdr_data %>%
  clean_names()
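To see what clean_names() does to column headers, here is a minimal sketch on a made-up data frame. The raw header names and values below are assumptions chosen to resemble spreadsheet-style headers; they are not copied from the SDR file.

```r
library(janitor)

# Hypothetical raw headers with spaces and capitals (made-up values)
toy_raw <- data.frame(
  "Country" = c("Country A", "Country B"),
  "Goal 14 Score" = c(70, 65),
  "Regions used for the SDR" = c("LAC", "LAC"),
  check.names = FALSE  # keep the messy names exactly as typed
)

# clean_names() converts the headers to lower-case snake_case
toy_clean <- clean_names(toy_raw)
names(toy_clean)
```

After cleaning, the columns can be typed without backticks, which is why the later code can refer to regions_used_for_the_sdr and goal_14_score directly.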
We are going to take a look at the LAC data by creating a new data frame from our original data set, sdr_data, that only includes countries in Latin America and the Caribbean. We are going to build it with the filter() function and name it LAC_sdr_data.
LAC_sdr_data <- sdr_data %>%
  filter(regions_used_for_the_sdr == "LAC")
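A quick way to confirm that filter() kept only the rows we wanted is to count the rows and list the remaining region values. The sketch below uses a made-up tibble; the country names and the non-LAC region labels are placeholders, not values taken from the SDR file.

```r
library(dplyr)

# Made-up data: two LAC rows and two rows from placeholder regions
toy <- tibble(
  country = c("Country A", "Country B", "Country C", "Country D"),
  regions_used_for_the_sdr = c("LAC", "Region X", "LAC", "Region Y")
)

# Keep only the rows whose region is exactly "LAC"
toy_lac <- toy %>%
  filter(regions_used_for_the_sdr == "LAC")

nrow(toy_lac)                             # number of rows kept
unique(toy_lac$regions_used_for_the_sdr)  # region values that remain
```

Running the same two checks on LAC_sdr_data should show a single region value if the filter worked as intended.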
There is now a data set in our environment containing only the LAC region data. However, we want to see the total scores for each goal.
We are going to create a new data frame (LAC_sdr_scores) that contains only the overall goal scores, not the indicator scores, for each goal and each country in the LAC region.
LAC_sdr_scores <- LAC_sdr_data %>%
  select(
    goal_1_score, goal_2_score, goal_3_score, goal_4_score, goal_5_score,
    goal_6_score, goal_7_score, goal_8_score, goal_9_score, goal_10_score,
    goal_11_score, goal_12_score, goal_13_score, goal_14_score, goal_15_score,
    goal_16_score, goal_17_score
  )
You should now see LAC_sdr_scores in your environment, with the number of variables reduced from 666 to 17.
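Typing all 17 column names works, but the same selection can be written more compactly with a tidyselect helper. The sketch below is an alternative, not the method used above: it uses matches() with a regular expression on a made-up tibble, and it also keeps the country column so each row stays identifiable. The goal_1_indicator column is a hypothetical stand-in for the indicator columns.

```r
library(dplyr)

# Made-up data with score columns and one hypothetical indicator column
toy_lac <- tibble(
  country = c("Country A", "Country B"),
  goal_1_score = c(50, 60),
  goal_1_indicator = c(1, 2),
  goal_2_score = c(55, 65)
)

# Keep country plus every column named goal_<number>_score
toy_scores <- toy_lac %>%
  select(country, matches("^goal_[0-9]+_score$"))

names(toy_scores)
```

Keeping country adds one column to the result, so on the real data this variant would give 18 variables rather than 17.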
Now we are going to plot goal 14, "Life below water", with ggplot. We are going to show each country's score by using geom_bar(stat = "identity").
ggplot(LAC_sdr_data, aes(x = goal_14_score, y = country)) +
  geom_bar(stat = "identity")
## Warning: Removed 8 rows containing missing values or values outside the scale range
## (`geom_bar()`).
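The warning above means that some rows have no goal 14 score (NA), so ggplot drops them. One way to handle this, sketched here on made-up data, is to count the missing values and filter them out before plotting. The sketch uses geom_col(), which is ggplot2's shorthand for geom_bar(stat = "identity"), and reorder() to sort the bars by score; both are optional choices, not part of the code above.

```r
library(dplyr)
library(ggplot2)

# Made-up data: one country has no goal 14 score
toy_lac <- tibble(
  country = c("Country A", "Country B", "Country C"),
  goal_14_score = c(72, NA, 58)
)

# How many rows would ggplot drop?
n_missing <- sum(is.na(toy_lac$goal_14_score))

# Remove the rows with missing scores so no warning is raised
toy_plot_data <- toy_lac %>%
  filter(!is.na(goal_14_score))

# Bar chart of scores, with countries ordered by score
bar_plot <- ggplot(toy_plot_data,
                   aes(x = goal_14_score, y = reorder(country, goal_14_score))) +
  geom_col() +
  labs(y = "country")

bar_plot
```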
Let's create a histogram of the SDG 14 scores from our LAC data set. We
are going to fill by country code so we can see the countries in
the LAC region.
ggplot(LAC_sdr_data, aes(x = goal_14_score, fill = country_code_iso3)) +
  geom_histogram() +
  theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 8 rows containing non-finite outside the scale range
## (`stat_bin()`).
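The `stat_bin()` message appears because no bin size was given, so ggplot falls back to 30 bins. A minimal sketch of setting it explicitly follows, using made-up scores. The binwidth of 5 score points is an assumption chosen for illustration, not a recommended value; boundary = 0 lines the bin edges up on multiples of 5, and na.rm = TRUE drops missing scores silently.

```r
library(ggplot2)

# Made-up scores, including one missing value
toy_lac <- data.frame(
  goal_14_score = c(42, 48, 55, 57, 61, 63, 66, 70, 74, NA)
)

# Histogram with an explicit bin width instead of the default 30 bins
hist_plot <- ggplot(toy_lac, aes(x = goal_14_score)) +
  geom_histogram(binwidth = 5, boundary = 0, na.rm = TRUE) +
  theme_minimal()

hist_plot
```

With a wider bin, countries with similar scores fall into the same bar, which usually makes the shape of the distribution easier to read for a small group of countries.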
We are going to clean up the graph by adding a title, axis labels, and a legend title with labs().
ggplot(LAC_sdr_data, aes(x = goal_14_score, fill = country_code_iso3)) +
  geom_histogram() +
  theme_minimal() +
  labs(title = "Distributions in LAC region",
       x = "SDG 14 Score",
       y = "Number of Countries",
       fill = "Countries")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 8 rows containing non-finite outside the scale range
## (`stat_bin()`).
Let's make the plot interactive by saving it under a name and passing it to ggplotly().
We are going to name it Goal_14_Score and add a viridis "inferno" fill scale.
Goal_14_Score <- ggplot(LAC_sdr_data, aes(x = goal_14_score, fill = country_code_iso3)) +
  geom_histogram() +
  theme_minimal() +
  scale_fill_viridis_d(option = "inferno") +
  labs(title = "Life below water scores",
       x = "SDG 14 Score",
       y = "Number of Countries",
       fill = "Countries")
ggplotly(Goal_14_Score)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 8 rows containing non-finite outside the scale range
## (`stat_bin()`).
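By default the hover box in ggplotly() shows every mapped value. As a rough sketch, the tooltip argument can limit it to chosen aesthetics. The data below is made up, the three-letter codes are placeholders rather than real ISO codes, and the binwidth of 5 is an assumption.

```r
library(ggplot2)
library(plotly)

# Made-up data with placeholder country codes
toy_lac <- data.frame(
  country_code_iso3 = c("AAA", "BBB", "CCC", "DDD"),
  goal_14_score = c(48, 57, 63, 71)
)

# Static histogram, saved under a name so it can be reused
toy_hist <- ggplot(toy_lac, aes(x = goal_14_score, fill = country_code_iso3)) +
  geom_histogram(binwidth = 5) +
  theme_minimal() +
  labs(x = "SDG 14 Score", y = "Number of Countries", fill = "Countries")

# Interactive version; the hover box is limited to the x and fill aesthetics
interactive_hist <- ggplotly(toy_hist, tooltip = c("x", "fill"))

interactive_hist
```

Saving the static plot first keeps both versions available: toy_hist for a printed report and interactive_hist for an HTML page.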